Description

1.0. Field of the invention 

[0001] The present invention relates to hybrid toxin fragments, and toxins comprising them, derived from Bacillus 
thuringiensis insecticidal crystal proteins. 

1.1. Description of the related art 

[0002] Bacillus thuringiensis (hereinafter B.t.) is a gram-positive bacterium that produces insecticidal crystal proteins
during sporulation. The crystal (Cry) proteins form a large (over 160 and growing) family of homologous proteins with
unique specificities. Each protein is active against only one or a few insect species. Most reported proteins are active
against lepidopterans, with a smaller number showing activity against diptera or coleoptera (as reviewed in Schnepf et
al., 1998). While the Cry proteins were initially classified according to their activity against one of the abovementioned
insect orders (Hofte and Whiteley, 1989), the more recent, now commonly accepted classification is based on amino
acid homology (Crickmore et al., 1998).

[0003] The mode of action of crystal proteins has been partially elucidated. After ingestion by the insect larvae,
crystals dissolve in the midgut environment, releasing the proteins as protoxins of 70-140 kDa. The solubilized protoxins 
are subsequently processed ("activated") by midgut proteases, resulting in a protease-resistant fragment of about 65 
kDa, which is the active toxin. The toxin binds to receptors on epithelial cells of the insect midgut and penetrates the 
membrane. This eventually leads to lysis of the cells and death of the larvae. 

[0004] The activity range of a particular delta-endotoxin is to a large extent determined by the occurrence of receptors
on the midgut cells of the insect, although solubilization efficiency and proteolytic activation are also factors involved.
The importance of binding to receptors is further exemplified by the decrease in binding occurring in many instances
of resistance to Cry proteins (Ferre et al., 1995).

[0005] Structure determination by X-ray crystallography has shown that three different Cry proteins, and probably all
Cry proteins, share a common three-domain structure (Li et al., 1991; Grochulski et al., 1995; Morse et al., 1998). If
projected on Cry1 sequences, domain I runs from about amino acid residue 28 to 260, domain II from about 260 to
460 and domain III from about 460 to 620. Since the various toxins have different lengths, the borders of the domains
can be defined only approximately. A person skilled in this art will be able to find the borders for the various toxins by
comparing the amino acid sequences. The N-terminal domain I consists of 7 α-helices and is considered to insert
(partially) into the target membrane, forming part of the pore that eventually kills the insect gut epithelial cells. Both
domain II and the C-terminal domain III are very variable, and have been shown to determine activity against specific
insects. Although it is not yet clear how these domains, individually or acting together, may determine specificity, there
is strong evidence that both can be involved in binding to (putative) receptors (Lee et al., 1995; Dean et al., 1996; de
Maagd et al., 1996b; de Maagd et al., 1999).

[0006] Exchange of domain III between toxins by in vivo recombination of their encoding genes may not only alter
the specificity of a toxin, but can also result in a hybrid toxin with superior toxicity for certain insects (Bosch et al., 1994;
de Maagd et al., 1996a). Intl. Pat. Appl. Publ. No. WO 95/06730 discloses the construction of a hybrid delta-endotoxin
consisting of domains I and II of Cry1E, and domain III and the protoxin-specific fragment of Cry1C. In bioassays, this
hybrid, as purified after production in E. coli, is active against Manduca sexta, Spodoptera exigua, and Mamestra
brassicae. When expressed in and purified from a recombinant Bacillus thuringiensis strain, the 1E/1C-hybrid was 1.5 times
as active against S. exigua as Cry1C, the most active natural toxin. The abovementioned patent application also
describes a hybrid delta-endotoxin consisting of domains I and II of Cry1Ab, and domain III of Cry1C. When purified from
a recombinant Bacillus thuringiensis strain, this Cry1Ab/Cry1C-hybrid was approximately 6.6 times as toxic as Cry1C
(de Maagd et al., 1996a). Intl. Pat. Appl. Publ. No. WO 98/22595 discloses a number of hybrid toxins consisting of
different combinations of fragments of Cry1Ab, Cry1Ac, Cry1C or Cry1F, of which some have improved activity against
important pest larvae and/or an extended activity spectrum.

[0007] Producing successful hybrids is not easy. From the prior art it appears that only a very limited number of hybrids
have an improved activity or a broader host-range specificity. In addition, there are many possibilities for recombinantly
engineered crystal proteins, in view of the large number of different crystal proteins (more than 160). Further, the effect
of the hybrids produced is not predictable.

2.0 Summary of the invention 

[0008] Proteins of the Cry3 (Herrnstadt et al., 1986; McPherson et al., 1988), Cry7 (Lambert et al., 1992) and Cry8
(Sato et al., 1994) classes were found to be active against insects of the order Coleoptera (beetles). Cry3A is the
single most active protein for the important potato pest Colorado potato beetle (CPB), Leptinotarsa decemlineata (Say).

[0009] It is known that resistance to the endotoxins may occur. Accordingly, resistance to Cry3A may occur in CPB
in response to its exposure to Cry3A in transgenic potato. Indeed, resistant CPB has already been disclosed.
Therefore, it is desirable to have an alternative or replacement for Cry3A.

[0010] The present inventors have found specific hybrid crystal proteins which have a high activity against CPB and
therefore can be used as a replacement for Cry3A. The hybrids of the invention comprise domains derived from Cry1Ia
and Cry1Ba.

[0011] Cry1 proteins are generally active against lepidopterans (larvae of moths and butterflies). Cry1B and Cry1I
have been shown to also have some activity against coleopterans, including CPB, although their toxicity for CPB
is much lower than that of Cry3A (Tailor et al., 1992; Bradley et al., 1995).

[0012] The toxicity of the hybrid crystal proteins of the invention against CPB is surprisingly higher than that of the
parental proteins Cry1Ba and Cry1Ia. Additionally, a number of the hybrids of the invention have retained the high
activity against some lepidopterans, making them toxins with meaningful dual activity against coleopterans and
lepidopterans.

[0013] According to the present invention there is provided a B.t. hybrid toxin fragment comprising structural domains
I, II and III in this order starting from the N-terminus, wherein the domains are derived from at least two different Cry
proteins, domain I is domain I of any B.t. Cry protein or a part of said domain or a peptide substantially similar to
said domain, domain II is domain II of Cry1Ia or a part of said domain or a peptide substantially similar to said domain,
and domain III is domain III of Cry1Ba or a part of said domain or a peptide substantially similar to said domain. Preferred
is a fragment which comprises domain I of Cry1Ia or Cry1Ba or a part of said domain or a peptide substantially similar
to said domain, domain II of Cry1Ia or a part of said domain or a peptide substantially similar to said domain, and
domain III of Cry1Ba or a part of said domain or a peptide substantially similar to said domain.

[0014] The term "or a peptide substantially similar to said domain" should be understood to mean a peptide having
an amino acid sequence which is at least 85% similar to the sequence of the domain. It is preferred that the degree of
similarity is at least 90%.

[0015] In the context of the present invention, two amino acid sequences with at least 85% or 90% similarity to each
other have at least 85% or 90% identical or conservatively replaced amino acid residues in a like position when aligned
optimally allowing for up to 6 gaps, with the proviso that in respect of the gaps a total of not more than 15 amino acid
residues are affected. For the purpose of the present invention conservative replacements may be made between
amino acids within the following groups:

1. Serine and Threonine;
2. Glutamic acid and Aspartic acid;
3. Arginine and Lysine;
4. Asparagine and Glutamine;
5. Isoleucine, Leucine, Valine, and Methionine;
6. Phenylalanine, Tyrosine, and Tryptophan;
7. Alanine and Glycine.

By "or a part of said domain" is meant a peptide comprised by the said domain and having at least 80% of the consecutive
sequence thereof.
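The similarity definition of paragraph [0015] can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned by some other tool and are supplied in one-letter code with "-" marking gap positions. The function name `similarity`, the choice of denominator (aligned, non-gap columns), and the way a "gap" is counted (a run of consecutive gapped columns) are assumptions made for illustration; the text above does not fix these details.

```python
# Conservative replacement groups 1-7 from paragraph [0015], in one-letter code.
CONSERVATIVE_GROUPS = ["ST", "ED", "RK", "NQ", "ILVM", "FYW", "AG"]


def _same_group(a, b):
    """True if residues a and b belong to the same conservative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def similarity(aligned_a, aligned_b):
    """Return (percent_similar, gaps_within_limits) for two pre-aligned sequences.

    Assumptions (not specified by the source): percent similarity is computed
    over columns where neither sequence has a gap, and one "gap" is one run of
    consecutive gapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matched = compared = gap_runs = gap_residues = 0
    in_gap = False
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            gap_residues += 1
            if not in_gap:
                gap_runs += 1
            in_gap = True
            continue
        in_gap = False
        compared += 1
        if a == b or _same_group(a, b):
            matched += 1
    percent = 100.0 * matched / compared if compared else 0.0
    # Limits from [0015]: up to 6 gaps, affecting at most 15 residues in total.
    gaps_ok = gap_runs <= 6 and gap_residues <= 15
    return percent, gaps_ok


# Toy example: every aligned pair is identical or a conservative replacement.
print(similarity("STEDAG-K", "TSDEGAWR"))
```

Under these assumptions a domain would count as "substantially similar" when the first value is at least 85 (preferably 90) and the second value is true.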

[0016] It is most particularly preferred that the toxin fragment according to the invention comprises either:

1. An amino acid sequence from about amino acid 20 to about amino acid 641 in SEQ ID NO:2 or
2. An amino acid sequence from about amino acid 20 to about amino acid 632 in SEQ ID NO:4.

[0017] The invention also includes a hybrid toxin comprising the above disclosed fragment or a toxin at least 85%
similar to such a hybrid toxin which has substantially similar insecticidal activity.

[0018] The hybrid toxin comprises the three structural domains as defined above and generally may comprise a
pro-toxin segment at the carboxyl end, which is not toxic and is thought to be important for crystal formation. SEQ ID NO:
2 and 4 comprise such a pro-toxin segment. Further, the hybrid toxin may comprise a leader sequence, being amino acids
1-19 in SEQ ID NO:2 and 4.

[0019] The invention still further includes pure proteins which are at least 90% identical to the toxin fragments or 
hybrid toxins according to the invention. 

[0020] The invention still further includes recombinant DNA comprising a sequence encoding a protein having an
amino acid sequence of the above disclosed toxins or fragments thereof.

[0021] In a preferred embodiment the invention provides recombinant DNA comprising the sequence as shown in 
SEQ ID NO:1 or 3 or DNA similar thereto encoding a substantially similar protein. 

[0022] In a more preferred embodiment the invention provides recombinant DNA comprising the sequence from
about nucleotide 170 to about 1929 in SEQ ID NO:1 or from about nucleotide 147 to about 1896 in SEQ ID NO:3.
[0023] By similar DNA is meant a test sequence which is capable of hybridizing to the inventive recombinant
sequence. When the test and inventive sequences are double stranded, the nucleic acid constituting the test sequence
preferably has a Tm within 20°C of that of the inventive sequence. In the case that the test and inventive sequences
are mixed together and denatured simultaneously, the Tm values of the sequences are preferably within 10°C of
each other. More preferably the hybridization is performed under stringent conditions, with either the test or inventive
DNA preferably being supported. Thus either a denatured test or inventive sequence is preferably first bound to a
support and hybridization is effected for a specified period of time at a temperature of between 50 and 70°C in
double-strength citrate-buffered saline (SSC) containing 0.1% SDS, followed by rinsing of the support at the same
temperature but with a buffer having a reduced SSC concentration. Depending upon the degree of stringency required,
and thus the degree of similarity of the sequences, such reduced concentration buffers are typically single-strength SSC
containing 0.1% SDS. Sequences having the greatest degree of similarity are those the hybridization of which is least
affected by washing in buffers of reduced concentration. It is most preferred that the test and inventive sequences are
so similar that the hybridization between them is substantially unaffected by washing or incubation in one-tenth-strength
sodium citrate buffer containing 0.1% SDS.
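The Tm criteria and the wash series of paragraph [0023] can be summarised in a short Python sketch. The composition used for single-strength SSC (150 mM NaCl plus 15 mM trisodium citrate) is the common laboratory convention and is an assumption here, since the text does not state it; the function and constant names are likewise illustrative only.

```python
# Assumed composition of single-strength (1x) SSC; not given in the source.
NACL_MM_1X = 150.0     # mM sodium chloride
CITRATE_MM_1X = 15.0   # mM trisodium citrate (three Na+ per citrate)


def sodium_mM(ssc_strength):
    """Approximate Na+ concentration (mM) for a given SSC strength."""
    return ssc_strength * (NACL_MM_1X + 3 * CITRATE_MM_1X)


def tm_within_limit(tm_test_c, tm_inventive_c, denatured_together=False):
    """Apply the Tm criterion of [0023].

    The limit is 20 degrees C in general, or 10 degrees C when the two
    sequences are mixed and denatured simultaneously.
    """
    limit = 10.0 if denatured_together else 20.0
    return abs(tm_test_c - tm_inventive_c) <= limit


# Buffer strengths named in [0023], from least to most stringent.
WASH_SERIES = [
    ("hybridization", 2.0),
    ("typical reduced-concentration wash", 1.0),
    ("most stringent wash", 0.1),
]

for label, strength in WASH_SERIES:
    print(f"{label}: {strength}x SSC, about {sodium_mM(strength):.1f} mM Na+")
```

Lower salt destabilises mismatched duplexes, which is why hybrids that survive the one-tenth-strength wash are taken to be the most similar.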

[0024] The recombinant DNA may further encode a protein having herbicide resistance, plant growth-promoting,
anti-fungal, anti-bacterial, anti-viral and/or anti-nematode properties. In the case that the DNA is to be introduced into
a heterologous organism it may be modified to remove known mRNA instability motifs (such as AT-rich regions) and
polyadenylation signals, and/or codons which are preferred by the organism into which the recombinant DNA is to be
inserted may be used, so that expression of the thus modified DNA in the said organism yields a protein substantially
similar to that obtained by expression of the unmodified recombinant DNA in the organism in which the protein
components of the hybrid toxin or toxin fragments are endogenous.

[0025] The invention still further includes a DNA sequence which is complementary to one which hybridizes under 
stringent conditions with the recombinant DNA according to the invention. 

[0026] Also included in the present invention are: a vector containing such a recombinant (or complementary thereto)
DNA sequence; a plant or micro-organism which includes, and enables expression of, such DNA; plants transformed
with such DNA; the progeny of such plants which contain the DNA stably incorporated and hereditable in a Mendelian
manner; and/or the seeds of such plants and such progeny. The invention still further includes protein derived from
expression of the said DNA, and insecticidal protein produced by expression of the recombinant DNA within plants
transformed therewith.

[0027] The invention still further includes an insecticidal composition containing one or more of the toxin fragments
or toxins comprising them according to the invention; a process for combatting insects which comprises exposing them
to such fragments or toxins or compositions; and an extraction process for obtaining insecticidal proteins from organic
material containing them comprising submitting the material to maceration and solvent extraction.

[0028] The invention will be further apparent from the following description, which describes the production of B.t.
hybrid toxin fragments according to the invention, taken in conjunction with the associated drawings and sequence
listings.

[0029] A person skilled in the art may use different methods to construct the hybrid toxin genes of the invention.
Domain encoding regions may be exchanged between two homologous, though different, genes by exchanging
fragments through restriction enzyme digestion and subsequent ligation, if the same restriction enzyme recognition sites
are present at the same position in the two genes. If suitable restriction enzyme recognition sites are not available,
new sites can be created at homologous positions in the two genes by various methods of site-directed mutagenesis,
as described in the examples below for construction of a common RsrII site using DNA oligomers SEQ ID NO:9 and SEQ
ID NO:10.

[0030] Alternatively, fragments that are to be combined in a hybrid toxin gene may be produced by PCR-mediated
amplification using the original genes as templates, taking care that the ends of the fragments contain compatible
restriction enzyme digestion sites. Furthermore, a hybrid toxin encoding DNA fragment may be produced by in vitro
recombination between overlapping, partially homologous DNA fragments as performed in so-called DNA shuffling
experiments (Crameri et al., 1998; Zhao et al., 1998). Also, hybrid toxin encoding DNA fragments may be produced by
in vivo recombination between two homologous DNA fragments that have been cloned in tandem on a single plasmid
(Bosch et al., 1994), followed by screening for the desired recombination events.

3.0 Brief description of the drawings 

[0031] Figure 1 shows the sequence of the relevant part of the cry1Ba (top) and cry1Ia (bottom) genes, respectively,
aligned with the respective oligomers (SEQ ID NO:9 and 10) used for mutagenesis in order to produce a common
RsrII restriction enzyme recognition site in both genes. Mutated nucleotides are indicated by asterisks and the newly
produced RsrII recognition sites are underlined.

[0032] Figure 2 shows both the cry1Ia gene (solid bar) as well as the cry1Ba fragment (open bar) presented as they
occur in expression vectors pSN18 and pSN17, respectively. Both genes are inserted in the pKK233-2 derived
expression vector for E. coli, pBD12. For construction of pSN18 the 3' part of the cry1Ba gene, encoding the C-terminal part
of the protoxin-specific fragment, was replaced by the corresponding part of the cry1Ca gene (base pairs 2038-3627;
dashed bar). The positions of the common NcoI and MunI restriction sites, as well as of the RsrII sites derived by
mutagenesis, and the BstXI site used for exchange of the 3' end of the cry1Ba gene are indicated, with nucleotide
position numbers below the respective genes.

[0033] Figure 3 shows in a scheme the construction of hybrid gene SN15 from cry1Ba (pSN17) and cry1Ia (pSN18),
and the subsequent construction of hybrid gene SN19 from cry1Ba (pSN17) and pSN15.

[0034] Figure 4 shows the alignment of cry1Ba and cry1Ia nucleotide sequences, as present in plasmids pSN17
and pSN18, respectively, around the junctions between domains II and III (A) and between domains I and II (B).
Nucleotide sequences as present in the hybrid genes of SN15 and pSN19 (A) or of pSN19 alone (B) are given in
capitals, while identical nucleotides in the DNA that is not present in SN15 and SN19 are represented by dots. The
positions of the common RsrII and MunI recognition sites are underlined. The site of crossover in the hybrids is
represented by the shift of the fully written out nucleotide sequence from top strand to bottom strand in these recognition sites.
The amino acid sequence of the resulting hybrid proteins is given in three-letter code below each alignment.

4.0 Brief description of the sequence identifiers 

[0035] SEQ ID NO:1 shows the nucleotide sequence of the hybrid protoxin gene SN15. 

[0036] SEQ ID NO:2 shows the amino acid sequence of the protein encoded by the gene SN15 shown in SEQ ID 
NO:1. 

[0037] SEQ ID NO:3 shows the nucleotide sequence of the hybrid protoxin gene SN19. 

[0038] SEQ ID NO:4 shows the amino acid sequence of the protein encoded by the gene SN19 shown in SEQ ID 
NO:3. 

[0039] SEQ ID NO:5 shows the nucleotide sequence of the modified cry1Ba gene as used in expression vector
pSN17.

[0040] SEQ ID NO:6 shows the amino acid sequence of the protein encoded by the cry1Ba gene shown in SEQ ID
NO:5.

[0041] SEQ ID NO:7 shows the nucleotide sequence of the modified cry1Ia gene as used in expression vector pSN18.

[0042] SEQ ID NO:8 shows the amino acid sequence of the protein encoded by the cry1Ia gene shown in SEQ ID
NO:7.

[0043] SEQ ID NO:9 shows the nucleotide sequence of the oligomer used for mutagenesis of the cry1Ba gene.

[0044] SEQ ID NO:10 shows the nucleotide sequence of the oligomer used for mutagenesis of the cry1Ia gene.
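The relationship between a nucleotide listing and its encoded protein listing (for example SEQ ID NO:1 and SEQ ID NO:2) can be checked with a small translation routine. The following Python sketch uses the standard genetic code and, as input, only the first 48 nucleotides of SEQ ID NO:1 as printed in the sequence listing; the function name `translate` is illustrative.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; stop codons are shown as '*'."""
    dna = dna.lower().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 48 nucleotides of SEQ ID NO:1 (hybrid protoxin gene SN15).
SN15_START = "atgaaactaa agaatcaaga taagcatcaa agtttttcta gcaatgcg"

print(translate(SN15_START))  # MKLKNQDKHQSFSSNA
```

The printed residues correspond to Met Lys Leu Lys Asn Gln Asp Lys His Gln Ser Phe Ser Ser Asn Ala, the first sixteen residues of SEQ ID NO:2.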

5.0 Examples 

5.1 DNA manipulations 

[0045] All recombinant DNA techniques are as described by Ausubel et al. (1997). Mutagenesis, restriction enzyme
digestion and ligation are performed according to the instructions of the manufacturers. DNA sequencing is performed
by the dideoxytriphosphate method with fluorescent dyes attached to the dideoxynucleotides. Analysis is automated
by using an Applied Biosystems 370A nucleotide sequence analyzer.

5.2 Expression vectors 

[0046] All Cry protein expression vectors are based on pBD12, a derivative of pKK233-2 (Bosch et al., 1994). For
expression of Cry3Aa protein, the full cry3Aa gene is cloned into pBD12, giving expression plasmid pMH10. For cloning
purposes both cry1Ba and cry1Ia are mutagenized in order to contain an NcoI site overlapping with the start
codon. For production of Cry1Ba protein, an NcoI-BstXI fragment (bases 1-1977) of cry1Ca in pBD150 (Bosch et al.,
1994) is replaced by the corresponding fragment of cry1Ba (bases 1-2037), resulting in cry1Ba expression vector
pMH19. pMH19 therefore contains the 5' active toxin encoding part of cry1Ba and the 3' protoxin-specific part of cry1Ca.
Cry1Ia expression vector pBD172 contains the full cry1Ia gene with the SpeI site (base 2180) fused to the SpeI site in
the polylinker of pBluescript SK+.

5.3 Mutagenesis of cry1Ba and cry1Ia

[0047] In order to be able to directly exchange the domain III encoding regions between cry1Ba and cry1Ia, a new
common restriction enzyme recognition site is made in both genes by site-directed mutagenesis. Complementary
mutagenic oligonucleotide pairs are used to create unique RsrII sites at positions 1464 and 1488 of cry1Ba (pMH19)
and cry1Ia (pBD172), respectively (see Fig. 1), using the QuickChange™ kit (Stratagene). These mutations
do not change the encoded proteins. This results in two new expression plasmids, pSN17 (Cry1Ba, SEQ ID NO:5)
and pSN18 (Cry1Ia, SEQ ID NO:7), respectively (Fig. 2). The unique RsrII restriction sites at the border of the domain
II and domain III encoding regions, together with the common MunI sites at the border between the domain I and
domain II encoding regions, allow simple swapping of domain encoding regions between the two genes.
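The position of the engineered site can be checked directly against the sequence listing. The Python sketch below searches nucleotides 1441-1500 of SEQ ID NO:1 (hybrid gene SN15, whose 5' part derives from the mutagenized cry1Ia gene) for an RsrII recognition sequence. The recognition sequence used, CG^GWCCG with W standing for A or T and the cut after the second base, is taken from general knowledge of the enzyme rather than from this document and should be read as an assumption.

```python
import re

# Assumed RsrII recognition sequence: CG^GWCCG, where W is A or T.
RSRII = re.compile(r"cgg[at]ccg")

# Nucleotides 1441-1500 of SEQ ID NO:1, copied from the sequence listing.
ROW_START = 1441
ROW = "gcatcacatgtgaaagcatcggtatattcttggacgcatcgtagtgcggaccgtacgaat"


def find_sites(seq, first_position=1):
    """Return 1-based positions of the first base of each RsrII site."""
    return [m.start() + first_position for m in RSRII.finditer(seq.lower())]


sites = find_sites(ROW, ROW_START)
# The enzyme is assumed to cut after the second base of the site, so the
# 5' fragment ends one base after the start of the site.
last_base_of_5prime_fragment = [s + 1 for s in sites]

print(sites, last_base_of_5prime_fragment)  # [1487] [1488]
```

Under the stated assumption, the single site found in this window leaves a 5' fragment ending at base 1488, which agrees with the position given for cry1Ia in paragraph [0047].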

5.4 Construction of hybrid toxins 

[0048] Fig. 3 schematically shows the construction of two novel hybrid toxins. Both pSN18 and pSN17 are digested
with NcoI and RsrII. Subsequently, the 1488 base fragment of pSN18, containing the domain I and II encoding fragment
of cry1Ia, is ligated into the corresponding sites of the pSN17-derived fragment containing the 3' portion of cry1Ba. This
results in plasmid pSN15 encoding a 1Ia/1Ia/1Ba-hybrid (SEQ ID NO:1). The nucleotide sequence of the crossover
region with the encoded amino acid sequence is shown in Fig. 4A. Subsequently an NcoI-MunI fragment (bases 1-896)
encoding domain I of Cry1Ia from pSN15 is replaced by the corresponding fragment encoding domain I of Cry1Ba,
derived from pSN17. This results in 1Ba/1Ia/1Ba-hybrid encoding plasmid pSN19 (SEQ ID NO:3). The nucleotide
sequence of the crossover region with the encoded amino acid sequence is shown in Fig. 4B.
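The fragment exchange that yields SN15 can be sketched at the sequence level as a simple join at the common RsrII cut positions. In the Python sketch below the two parental genes are represented by placeholder strings, since SEQ ID NO:5 and SEQ ID NO:7 are not reproduced here. The length of 3627 bases for the pSN17 insert is an assumption read from the base-pair range in the Figure 2 legend, and the length of 2200 for the cry1Ia placeholder is purely hypothetical; only the cut positions 1488 and 1464 come from paragraphs [0047] and [0048].

```python
def swap_at_common_site(five_prime_gene, cut_5, three_prime_gene, cut_3):
    """Join bases 1..cut_5 of one gene to bases cut_3+1..end of another.

    cut_5 and cut_3 are the 1-based positions of the last base 5' of the
    cut in each gene (an interpretation of the positions in the text).
    """
    if not 0 < cut_5 <= len(five_prime_gene):
        raise ValueError("cut_5 outside the 5' donor gene")
    if not 0 <= cut_3 < len(three_prime_gene):
        raise ValueError("cut_3 outside the 3' donor gene")
    return five_prime_gene[:cut_5] + three_prime_gene[cut_3:]


# Placeholder sequences: 'i' stands for cry1Ia-derived bases, 'b' for bases
# from the pSN17 insert. Lengths are assumptions, not source data.
CRY1IA_PLACEHOLDER = "i" * 2200   # hypothetical length
PSN17_PLACEHOLDER = "b" * 3627    # assumed from the Figure 2 legend

sn15_like = swap_at_common_site(CRY1IA_PLACEHOLDER, 1488, PSN17_PLACEHOLDER, 1464)
print(len(sn15_like))  # 3651
```

Under these assumptions the joined length is 1488 + (3627 - 1464) = 3651, which matches the length given for SEQ ID NO:1 in the sequence listing. The second step, replacing the NcoI-MunI fragment to obtain SN19, would follow the same pattern with the MunI cut positions of the two genes.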

5.5 Protein isolation and insect bioassays

[0049] For large-scale production, all parental and hybrid protoxins are expressed in E. coli strain XL-1 and extracted
as described earlier (Bosch et al., 1994). Solubilized protoxins are dialyzed overnight in 25 mM NaHCO3, 100 mM NaCl,
pH 10. Protein concentrations are estimated by SDS-PAGE (sodium dodecylsulphate polyacrylamide gel
electrophoresis). To test toxicity to Colorado potato beetle (CPB), leaflets of greenhouse-grown potato cultivar Desiree are dipped
in toxin dilutions in water containing 0.01% Tween-20. After the leaves have air-dried they are transferred to petri
dishes and 10 neonate CPB larvae are placed on each leaf. After incubation for two days at 28°C, the leaves are
replaced by fresh leaves dipped in identical protoxin dilutions. Mortality is scored after 4 days. LC50 (concentration with
50% mortality) and 95% fiducial limits are determined by Probit analysis of results from three or more independent
experiments, using the PoloPC computer program (Russel et al., 1977).

5.6 Toxicity of wild type and hybrid proteins to Colorado potato beetle 

[0050] Table 1 shows the toxicity of the hybrid proteins SN15 and SN19 against Colorado potato beetle (CPB), as
compared to the parental proteins Cry1Ba and Cry1Ia, and to the most CPB-active natural toxin available, Cry3A. The
1Ia/1Ia/1Ba-hybrid SN15 shows slightly higher toxicity (lower LC50) than its best parental toxin, Cry1Ia, on a per weight
basis. Considering that the molecular weight of the SN15 protein is considerably larger than that of Cry1Ia, it
follows that SN15 performs even better on a per mole basis. The 1Ba/1Ia/1Ba-hybrid protein SN19 is considerably
more toxic still than SN15.

Table 1. Toxicity of wild type and hybrid protoxins to Colorado potato beetles.

Protoxin         LC50 (a)   95% fiducial limits (a)   MW (b)   Relative toxicity (c)
MH10 (Cry3A)     1.8        1.4-2.5                   74.0     100
SN17 (Cry1Ba)    142        105-198                   137.4    1
SN18 (Cry1Ia)    34         23-47                     81.3     6
SN15             22         14-35                     138.0    15
SN19             8          5-11                      137.2    42

(a) Concentration in micrograms per milliliter of dipping solution. LC50: concentration which leads to 50% mortality.
(b) Approximate molecular weight in kiloDaltons.
(c) Relative toxicity on a molar basis in percent, with toxicity of Cry3A set at 100%.
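The conversion from a per-weight LC50 to relative toxicity on a molar basis can be illustrated with a brief Python sketch. It assumes that footnote (c) corresponds to the ratio of molar LC50 values (LC50 divided by molecular weight), with Cry3A as the reference; the exact procedure used for the table is not stated, so this is an interpretation. The LC50 and MW values are those of Table 1.

```python
# (LC50 in micrograms/ml, MW in kDa), copied from Table 1.
TABLE_1 = {
    "MH10 (Cry3A)": (1.8, 74.0),
    "SN17 (Cry1Ba)": (142.0, 137.4),
    "SN18 (Cry1Ia)": (34.0, 81.3),
    "SN15": (22.0, 138.0),
    "SN19": (8.0, 137.2),
}


def molar_lc50(lc50_ug_per_ml, mw_kda):
    """LC50 in micromolar: (micrograms/ml) / kDa gives micromoles per liter."""
    return lc50_ug_per_ml / mw_kda


def relative_toxicity(name, reference="MH10 (Cry3A)"):
    """Toxicity relative to the reference, in percent, on a molar basis."""
    return 100.0 * molar_lc50(*TABLE_1[reference]) / molar_lc50(*TABLE_1[name])


for protoxin in TABLE_1:
    print(f"{protoxin}: {relative_toxicity(protoxin):.1f}%")
```

With this formula the values for SN18, SN15 and SN19 round to the tabulated 6, 15 and 42. The SN17 row evaluates to a little over 2 rather than the tabulated 1, so the tabulated figure may rest on unrounded inputs or a slightly different calculation.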



Although the present invention has been particularly described for the production of SN15 and SN19 hybrid toxins, the
skilled scientist will appreciate that other hybrid toxins with improved insecticidal characteristics may be produced
according to the invention. Hybrid toxins containing the combination of Cry1Ia domain II and Cry1Ba domain III, with
various combinations of domain I and C-terminal extensions, may be made. Moreover, the gene encoding SN15, SN19
and/or other hybrids may be transferred into strains of B.t. and/or integrated into the chromosome of strains of B.t. or of
other bacteria, or be introduced into plant genomes to provide for in situ insecticidal activity within the plant per se. In
this regard, it is particularly preferred that the recombinant DNA encoding the toxins is modified in that sequences
which are detrimental to high-level expression in plants are removed and in that codons which are preferred by the
plant are used. This should lead to production in the plant of a protein substantially similar to that obtained by expression
of the unmodified recombinant DNA in the organism in which the protein components of the hybrid toxins or toxin
fragments are endogenous.

SEQUENCE LISTING

<110> CPRO-DLO 

<120> Bacillus thuringiensis hybrid toxins 

<130> E158870 

<140> 
<141> 

<160> 10 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 3651 

<212> DNA
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: hybrid 
protoxin gene SN15 

<400> 1 

atgaaactaa agaatcaaga taagcatcaa agtttttcta gcaatgcgaa agtagataaa 60
atctctacgg attcactaaa aaatgaaaca gatatagaat tacaaaacat taatcatgaa 120
gattgtttga aaatgtctga gtatgaaaat gtagagccgt ttgttagtgc atcaacaatt 180
caaacaggta ttggtattgc gggtaaaata cttggtaccc taggcgttcc ttttgcagga 240
caagtagcta gtctttatag ttttatctta ggtgagctat ggcctaaggg gaaaaatcaa 300
tgggaaatct ttatggaaca tgtagaagag attattaatc aaaaaatatc aacttatgca 360
agaaataaag cacttacaga cttgaaagga ttaggagatg ccttagctgt ctaccatgat 420
tcgcttgaaa gttgggttgg aaatcgtaat aacacaaggg ctaggagtgt tgtcaagagc 480
caatatatcg cattagaatt gatgttcgtt cagaaactac cttcttttgc agtgtctgga 540
gaggaggtac cattattacc gatatatgcc caagctgcaa atttacattt gttgctatta 600
agagatgcat ctatttttgg aaaagagtgg ggattatcat cttcagaaat ttcaacattt 660
tataaccgtc aagtcgaacg agcaggagat tattcctacc attgtgtgaa atggtatagc 720
acaggtctaa ataacttgag gggtacaaat gccgaaagtt gggtacgata taatcaattc 780
cgtagagaca tgactttaat ggtactagat ttagtggcac tatttccaag ctatgataca 840
caaatgtatc caattaaaac tacagcccaa cttacaagag aagtatatac agacgcaatt 900
gggacagtac atccgcatcc aagttttaca agtacgactt ggtataataa taatgcacct 960
tcgttctctg ccatagaggc tgctgtcgtt cgaaacccgc atctactcga ttttctagaa 1020
caagttacaa tttacagctt attaagtcga tggagtaaca ctcagtatat gaatatgtgg 1080
ggaggacata aactagaatt ccgaacaata ggaggaacgt taaatatctc aacacaagga 1140
tctactaata cttctattaa tcctgtaaca ttaccgttca cttctcgaga cgtctatagg 1200
actgaatcat tggcagggct gaatctattt ttaactcaac ctgttaatgg agtacctagg 1260
gctgattttc attggaaatt cgtcacacat ccgatcgcat ctgataattt ctattatcca 1320
gggtatgctg gaattgggac gcaattacag gattcagaaa atgaattacc acctgaagca 1380
acaggacagc caaattatga atcttatagt catagattat ctcatatagg actcattcca 1440
gcatcacatg tgaaagcatc ggtatattct tggacgcatc gtagtgcgga ccgtacgaat 1500
acgattggac caaatagaat cacccaaatc ccaatggtaa aagcatccga acttcctcaa 1560
ggtaccactg ttgttagagg accaggattt actggtgggg atattcttcg aagaacgaat 1620
actggtggat ttggaccgat aagagtaact gttaacggac cattaacaca aagatatcgt 1680
ataggattcc gctatgcttc aaccgtagat tttgatttct ttgtatcacg tggaggtact 1740
actgtaaata attttagatt cctacgtaca atgaacagtg gagacgaact aaaatacgga 1800
aattttgtga gacgtgcttt tactacacct tttactttta cacaaattca agatataatt 1860
cgaacgtcta ttcaaggcct tagtggaaat ggggaagtgt atatagataa aattgaaatt 1920
attccagtta ctgcaacctt cgaagcagaa tatgatttag aaagagcgca agaggcggtg 1980
aatgctctgt ttactaatac gaatccaaga agattgaaaa cagatgtgac agattatcac 2040
attgatcaag tatccaattt agtggattgt ttatcagatg aattttgtct ggatgaaaag 2100
cgagaattgt ccgagaaagt caaacatgcg aagcgactca gtgatgagcg gaatttactt 2160
caagatccaa acttcagagg gatcaataga caaccagacc gtggctggag aggaagtaca 2220
gatattacca cccaaggagg agatgacgta ttcaaagaga attacgtcac actaccgggt 2280
accgttgatg agtgctatcc aacgtattta tatcagaaaa tagatgagtc gaaattaaaa 2340
gcttataccc gttatgaatt aagagggtat atcgaagata gtcaagactt agaaatctat 2400
ttgatccgtt acaatgcaaa acacgaaata gtaaatgtgc caggcacggg ttccttatgg 2460
ccgctttcag cccaaagtcc aatcggaaag tgtggagaac cgaatcgatg cgcgccacac 2520
cttgaatgga atcctgatct agattgttcc tgcagagacg gggaaaaatg tgcacatcat 2580
tcccatcatt tcaccttgga tattgatgtt ggatgtacag acttaaatga ggacttaggt 2640
gtatgggtga tattcaagat taagacgcaa gatggccatg caagactagg gaatctagag 2700
tttctcgaag agaaaccatt attaggggaa gcactagctc gtgtgaaaag agcggagaag 2760
aagtggagag acaaacgaga gaaactgcag ttggaaacaa atattgttta taaagaggca 2820
aaagaatctg tagatgcttc atttgtaaac tctcaatatg atagattaca agtggatacg 2880
aacatcgcga tgattcatgc ggcagataaa cgcgttcata gaatccggga agcgtatctg 2940
ccagagttgt ctgtgattcc aggtgtcaat gcggccattt tcgaagaatt agagggacgt 3000
atttttacag cgtattcctt atatgatgcg agaaatgtca ttaaaaatgg cgatttcaat 3060
aatggcttat tatgctggaa cgtgaaaggt catgtagatg tagaagagca aaacaaccac 3120
cgttcggtcc ttgttatccc agaatgggag gcagaagtgt cacaagaggt tcgtgtctgt 3180
ccaggtcgtg gctatatcct tcgtgtcaca gcatataaag agggatatgg agagggctgc 3240
gtaacgatcc atgagatcga agacaataca gacgaactga aattcagcaa ctgtgtagaa 3300
gaggaagtat atccaaacaa cacagtaacg tgtaataatt atactgggac tcaagaagaa 3360
tatgagggta cgtacacttc tcctaatcaa ggatatgacg aagcctatgg taataaccct 3420
tccgtaccag ctgattacgc ttcagtctat gaagaaaaat cgtatacaga tggacgaaga 3480
gagaatcctc gtgaatctaa cagaggctat ggggattaca caccactacc ggctggttat 3540
gtaacaaagg atttagagta ctmcccagag accgataagg tatggattga gatcggagaa 3600
acagaaggaa cattcatcgt ggatagcgtg gaattactcc ttatggagga a 3651



<210> 2
<211> 1217
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: protein
encoded by the gene SN15

<400> 2

Met Lys Leu Lys Asn Gln Asp Lys His Gln Ser Phe Ser Ser Asn Ala
1 5 10 15

Lys Val Asp Lys Ile Ser Thr Asp Ser Leu Lys Asn Glu Thr Asp Ile
20 25 30

Glu Leu Gln Asn Ile Asn His Glu Asp Cys Leu Lys Met Ser Glu Tyr
35 40 45

Glu Asn Val Glu Pro Phe Val Ser Ala Ser Thr Ile Gln Thr Gly Ile
50 55 60

Gly Ile Ala Gly Lys Ile Leu Gly Thr Leu Gly Val Pro Phe Ala Gly
65 70 75 80

Gln Val Ala Ser Leu Tyr Ser Phe Ile Leu Gly Glu Leu Trp Pro Lys
85 90 95

Gly Lys Asn Gln Trp Glu Ile Phe Met Glu His Val Glu Glu Ile Ile
100 105 110

Asn Gln Lys Ile Ser Thr Tyr Ala Arg Asn Lys Ala Leu Thr Asp Leu
115 120 125

Lys Gly Leu Gly Asp Ala Leu Ala Val Tyr His Asp Ser Leu Glu Ser
130 135 140

Trp Val Gly Asn Arg Asn Asn Thr Arg Ala Arg Ser Val Val Lys Ser
145 150 155 160

Gln Tyr Ile Ala Leu Glu Leu Met Phe Val Gln Lys Leu Pro Ser Phe
165 170 175

Ala Val Ser Gly Glu Glu Val Pro Leu Leu Pro Ile Tyr Ala Gln Ala
180 185 190

Ala Asn Leu His Leu Leu Leu Leu Arg Asp Ala Ser Ile Phe Gly Lys
195 200 205

Glu Trp Gly Leu Ser Ser Ser Glu Ile Ser Thr Phe Tyr Asn Arg Gln
210 215 220

Val Glu Arg Ala Gly Asp Tyr Ser Tyr His Cys Val Lys Trp Tyr Ser
225 230 235 240

Thr Gly Leu Asn Asn Leu Arg Gly Thr Asn Ala Glu Ser Trp Val Arg
245 250 255

Tyr Asn Gln Phe Arg Arg Asp Met Thr Leu Met Val Leu Asp Leu Val
260 265 270

Ala Leu Phe Pro Ser Tyr Asp Thr Gln Met Tyr Pro Ile Lys Thr Thr
275 280 285

Ala Gln Leu Thr Arg Glu Val Tyr Thr Asp Ala Ile Gly Thr Val His
290 295 300

Pro His Pro Ser Phe Thr Ser Thr Thr Trp Tyr Asn Asn Asn Ala Pro
305 310 315 320

Ser Phe Ser Ala Ile Glu Ala Ala Val Val Arg Asn Pro His Leu Leu
325 330 335

Asp Phe Leu Glu Gln Val Thr Ile Tyr Ser Leu Leu Ser Arg Trp Ser
340 345 350

Asn Thr Gln Tyr Met Asn Met Trp Gly Gly His Lys Leu Glu Phe Arg
355 360 365

Thr Ile Gly Gly Thr Leu Asn Ile Ser Thr Gln Gly Ser Thr Asn Thr
370 375 380

Ser Ile Asn Pro Val Thr Leu Pro Phe Thr Ser Arg Asp Val Tyr Arg
385 390 395 400

Thr Glu Ser Leu Ala Gly Leu Asn Leu Phe Leu Thr Gln Pro Val Asn
405 410 415

Gly Val Pro Arg Val Asp Phe His Trp Lys Phe Val Thr His Pro Ile
420 425 430

Ala Ser Asp Asn Phe Tyr Tyr Pro Gly Tyr Ala Gly Ile Gly Thr Gln
435 440 445

Leu Gln Asp Ser Glu Asn Glu Leu Pro Pro Glu Ala Thr Gly Gln Pro
450 455 460

Asn Tyr Glu Ser Tyr Ser His Arg Leu Ser His Ile Gly Leu Ile Ser
465 470 475 480

Ala Ser His Val Lys Ala Ser Val Tyr Ser Trp Thr His Arg Ser Ala
485 490 495

Asp Arg Thr Asn Thr Ile Gly Pro Asn Arg Ile Thr Gln Ile Pro Met
500 505 510



Val Lys Ala Ser Glu Leu Pro Gln Gly Thr Thr Val Val Arg Gly Pro
515 520 525

Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asn Thr Gly Gly Phe
530 535 540

Gly Pro Ile Arg Val Thr Val Asn Gly Pro Leu Thr Gln Arg Tyr Arg
545 550 555 560

Ile Gly Phe Arg Tyr Ala Ser Thr Val Asp Phe Asp Phe Phe Val Ser
565 570 575

Arg Gly Gly Thr Thr Val Asn Asn Phe Arg Phe Leu Arg Thr Met Asn
580 585 590

Ser Gly Asp Glu Leu Lys Tyr Gly Asn Phe Val Arg Arg Ala Phe Thr
595 600 605

Thr Pro Phe Thr Phe Thr Gln Ile Gln Asp Ile Ile Arg Thr Ser Ile
610 615 620

Gln Gly Leu Ser Gly Asn Gly Glu Val Tyr Ile Asp Lys Ile Glu Ile
625 630 635 640

Ile Pro Val Thr Ala Thr Phe Glu Ala Glu Tyr Asp Leu Glu Arg Ala
645 650 655

Gln Glu Ala Val Asn Ala Leu Phe Thr Asn Thr Asn Pro Arg Arg Leu
660 665 670

Lys Thr Asp Val Thr Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val
675 680 685

Asp Cys Leu Ser Asp Glu Phe Cys Leu Asp Glu Lys Arg Glu Leu Ser
690 695 700

Glu Lys Val Lys His Ala Lys Arg Leu Ser Asp Glu Arg Asn Leu Leu
705 710 715 720

Gln Asp Pro Asn Phe Arg Gly Ile Asn Arg Gln Pro Asp Arg Gly Trp
725 730 735

Arg Gly Ser Thr Asp Ile Thr Ile Gln Gly Gly Asp Asp Val Phe Lys
740 745 750

Glu Asn Tyr Val Thr Leu Pro Gly Thr Val Asp Glu Cys Tyr Pro Thr
755 760 765



Tyr Leu Tyr Gln Lys Ile Asp Glu Ser Lys Leu Lys Ala Tyr Thr Arg
770 775 780

Tyr Glu Leu Arg Gly Tyr Ile Glu Asp Ser Gln Asp Leu Glu Ile Tyr
785 790 795 800

Leu Ile Arg Tyr Asn Ala Lys His Glu Ile Val Asn Val Pro Gly Thr
805 810 815

Gly Ser Leu Trp Pro Leu Ser Ala Gln Ser Pro Ile Gly Lys Cys Gly
820 825 830

Glu Pro Asn Arg Cys Ala Pro His Leu Glu Trp Asn Pro Asp Leu Asp
835 840 845

Cys Ser Cys Arg Asp Gly Glu Lys Cys Ala His His Ser His His Phe
850 855 860

Thr Leu Asp Ile Asp Val Gly Cys Thr Asp Leu Asn Glu Asp Leu Gly
865 870 875 880

Val Trp Val Ile Phe Lys Ile Lys Thr Gln Asp Gly His Ala Arg Leu
885 890 895

Gly Asn Leu Glu Phe Leu Glu Glu Lys Pro Leu Leu Gly Glu Ala Leu
900 905 910

Ala Arg Val Lys Arg Ala Glu Lys Lys Trp Arg Asp Lys Arg Glu Lys
915 920 925

Leu Gln Leu Glu Thr Asn Ile Val Tyr Lys Glu Ala Lys Glu Ser Val
930 935 940

Asp Ala Leu Phe Val Asn Ser Gln Tyr Asp Arg Leu Gln Val Asp Thr
945 950 955 960

Asn Ile Ala Met Ile His Ala Ala Asp Lys Arg Val His Arg Ile Arg
965 970 975

Glu Ala Tyr Leu Pro Glu Leu Ser Val Ile Pro Gly Val Asn Ala Ala
980 985 990

Ile Phe Glu Glu Leu Glu Gly Arg Ile Phe Thr Ala Tyr Ser Leu Tyr
995 1000 1005

Asp Ala Arg Asn Val Ile Lys Asn Gly Asp Phe Asn Asn Gly Leu Leu
<220>

<223> Description of Artificial Sequence: hybrid 
protoxin gene SN19 

<400> 3

atggcttcaa ataggaaaaa tgagaatgaa attataaatg ctgtatcgaa tcattccgca 60 
caaatggatc tattaccaga tgctcgtatt gaggatagct tgtgtatagc cgaggggaac 120 
aatatcgatc catttgttag cgcatcaaca gtccaaacgg gtattaacat agctggtaga 180 
atactaggcg tattgggcgt accgtttgct ggacaactag ctagttttta tagttttctt 240 
gttggtgaat tatggccccg cggcagagat cagtgggaaa ttttcctaga acatgtcgaa 300
caacttataa atcaacaaat aacagaaaat gctaggaata cggctcttgc tcgattacaa 360 
ggtttaggag attccttcag agcctatcaa cagtcacttg aagattggct agaaaaccgt 420
gatgatgcaa gaacgagaag tgttctttat acccaatata tagctttaga acttgatttt 480
cttaatgcga tgccgctttt cgcaattaga aaccaagaag ttccattatt gatggtatat 540 
gctcaagctg caaatttaca cctattatta ttgagagatg cctctctttt tggtagtgaa 600 
tttgggctta catcgcagga aattcaacgc tattatgagc gccaagtgga acgaacgaga 660
gattattccg actattgcgt agaatggtat aatacaggtc taaatagctt gagagggaca 720 
aatgccgcaa gttgggtacg gtataatcaa ttccgtagag atctaacgtt aggagtatta 780
gatctagtgg cactattccc aagctatgac actcgcactt atccaataaa tacgagtgct 840 
cagttaacaa gagaagttta tacagacgca attgggacag tacatccgca tccaagtttt 900 
acaagtacga cttggtataa taataatgca ccttcgttct ctgccataga ggctgctgtt 960 
gttcgaaacc cgcatctact cgattttcta gaacaagtta caatttacag cttattaagt 1020 
cgatggagta acactcagta tatgaatatg tggggaggac ataaactaga attccgaaca 1080 
ataggaggaa cgttaaatat ctcaacacaa ggatctacta atacttctat taatcctgta 1140 
acattaccgt tcacttctcg agacgtctat aggactgaat cattggcagg gctgaatcta 1200
tttttaactc aacctgttaa tggagtacct agggttgatt ttcattggaa attcgtcaca 1260 
catccgatcg catctgataa tttctattat ccagggtatg ctggaattgg gacgcaatta 1320 
caggattcag aaaatgaatt accacctgaa gcaacaggac agccaaatta tgaatcttat 1380 
agtcatagat tatctcatat aggactcatt tcagcatcac atgtgaaagc atcggtatat 1440
tcttggacgc atcgtagtgc ggaccgtacg aatacgattg gaccaaatag aatcacccaa 1500
atcccaatgg taaaagcatc cgaacttcct caaggtacca ctgttgttag aggaccagga 1560 
tttactggtg gggatattct tcgaagaacg aatactggtg gatttggacc gataagagta 1620 
actgttaacg gaccattaac acaaagatat cctataggat tccgctatgc ttcaactgta 1680 
gattttgatt tctttgtatc acgtggaggt actactgtaa ataattttag attcctacgt 1740 
acaatgaaca gtggagacga actaaaatac ggaaattttg tgagacgtgc ttttactaca 1800 
ccttttactt ttacacaaat tcaagatata attcgaacgt ctattcaagg ccttagtgga 1860
aatggggaag tgtatataga taaaattgaa attattccag ttactgcaac cttcgaagca 1920 
gaatatgatt tagaaagagc gcaagaggcg gtgaatgctc tgtttactaa tacgaatcca 1980
agaagattga aaacagatgt gacagattat catattgatc aagtatccaa tttagtggat 2040 
tgtttatcag atgaattttg tctggatgaa aagcgagaat tgtccgagaa agtcaaacat 2100 
gcgaagcgac tcagtgatga gcggaattta cttcaagatc caaacttcag agggatcaat 2160 
agacaaccag accgtggctg gagaggaagt acagatatta ccatccaagg aggagatgac 2220 
gtattcaaag agaattacgt cacactaccg ggtaccgttg atgagtgcta tccaacgtat 2280 
ttatatcaga aaatagatga gtcgaaatta aaagcttata cccgttatga attaagaggg 2340 
tatatcgaag atagtcaaga cttagaaatc tatttgatcc gttacaatgc aaaacacgaa 2400 
atagtaaatg tgccaggcac gggttcctta tggccgcttt cagcccaaag tccaatcgga 2460
aagtgtggag aaccgaatcg atgcgcgcca caccttgaat ggaatcctga tctagattgt 2520 
tcctgcagag acggggaaaa atgtgcacat cattcccatc atttcacctt ggatactgat 2580
gttggatgta cagacttaaa tgaggactta ggtgtatggg tgatattcaa gattaagacg 2640
caagatggcc atgcaagact agggaatcta gagtttctcg aagagaaacc attattaggg 2700
gaagcactag ctcgtgtgaa aagagcggag aagaagtgga gagacaaacg agagaaactg 2760
cagttggaaa caaatattgt ttataaagag gcaaaagaat ctgtagatgc tttatttgta 2820
aactctcaat atgatagatt acaagtggat acgaacatcg cgatgattca tgcggcagat 2880
aaacgcgttc atagaatccg ggaagcgtat ctgccagagt tgtctgtgat tccaggtgtc 2940
aatgcggcca ttttcgaaga attagaggga cgtattttta cagcgtattc cttatatgat 3000
gcgagaaatg tcattaaaaa tggcgatttc aataatggct tattatgctg gaacgtgaaa 3060
ggtcatgtag atgtagaaga gcaaaacaac caccgttcgg tccttgttat cccagaatgg 3120
gaggcagaag tgtcacaaga ggttcgtgtc tgtccaggtc gtggctatat ccttcgtgtc 3180
acagcatata aagagggata tggagagggc tgcgtaacga tccatgagat cgaagacaat 3240
acagacgaac tgaaattcag caactgtgta gaagaggaag tatatccaaa caacacagta 3300
acgtgtaata attatactgg gactcaagaa gaatatgagg gtacgtacac ttctcgtaat 3360
caaggatatg acgaagccta tggtaataac ccttccgtac cagctgatta cgcttcagtc 3420
tatgaagaaa aatcgtatac agatggacga agagagaatc cttgtgaatc taacagaggc 3480
tatggggatt acacaccact accggctggt tatgtaacaa aggatttaga gtacttccca 3540
gagaccgata aggtatggat tgagatcgga gaaacagaag gaacattcat cgtggatagc 3600
gtggaattac tccttatgga ggaa 3624



<210> 4 
<211> 1208 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: protein 
encoded by the gene SN19 

<400> 4 

Met Ala Ser Asn Arg Lys Asn Glu Asn Glu Ile Ile Asn Ala Val Ser
1 5 10 15

Asn His Ser Ala Gln Met Asp Leu Leu Pro Asp Ala Arg Ile Glu Asp
20 25 30

Ser Leu Cys Ile Ala Glu Gly Asn Asn Ile Asp Pro Phe Val Ser Ala
35 40 45

Ser Thr Val Gln Thr Gly Ile Asn Ile Ala Gly Arg Ile Leu Gly Val
50 55 60

Leu Gly Val Pro Phe Ala Gly Gln Leu Ala Ser Phe Tyr Ser Phe Leu
65 70 75 80

Val Gly Glu Leu Trp Pro Arg Gly Arg Asp Gln Trp Glu Ile Phe Leu
85 90 95

Glu His Val Glu Gln Leu Ile Asn Gln Gln Ile Thr Glu Asn Ala Arg
100 105 110

Asn Thr Ala Leu Ala Arg Leu Gln Gly Leu Gly Asp Ser Phe Arg Ala
115 120 125

Tyr Gln Gln Ser Leu Glu Asp Trp Leu Glu Asn Arg Asp Asp Ala Arg
130 135 140

Thr Arg Ser Val Leu Tyr Thr Gln Tyr Ile Ala Leu Glu Leu Asp Phe
145 150 155 160

Leu Asn Ala Met Pro Leu Phe Ala Ile Arg Asn Gln Glu Val Pro Leu
165 170 175

Leu Met Val Tyr Ala Gln Ala Ala Asn Leu His Leu Leu Leu Leu Arg
180 185 190

Asp Ala Ser Leu Phe Gly Ser Glu Phe Gly Leu Thr Ser Gln Glu Ile
195 200 205

Gln Arg Tyr Tyr Glu Arg Gln Val Glu Arg Thr Arg Asp Tyr Ser Asp
210 215 220

Tyr Cys Val Glu Trp Tyr Asn Thr Gly Leu Asn Ser Leu Arg Gly Thr
225 230 235 240

Asn Ala Ala Ser Trp Val Arg Tyr Asn Gln Phe Arg Arg Asp Leu Thr
245 250 255

Leu Gly Val Leu Asp Leu Val Ala Leu Phe Pro Ser Tyr Asp Thr Arg
260 265 270

Thr Tyr Pro Ile Asn Thr Ser Ala Gln Leu Thr Arg Glu Val Tyr Thr
275 280 285

Asp Ala Ile Gly Thr Val His Pro His Pro Ser Phe Thr Ser Thr Thr
290 295 300

Trp Tyr Asn Asn Asn Ala Pro Ser Phe Ser Ala Ile Glu Ala Ala Val
305 310 315 320

Val Arg Asn Pro His Leu Leu Asp Phe Leu Glu Gln Val Thr Ile Tyr
325 330 335

Ser Leu Leu Ser Arg Trp Ser Asn Thr Gln Tyr Met Asn Met Trp Gly
340 345 350

Gly His Lys Leu Glu Phe Arg Thr Ile Gly Gly Thr Leu Asn Ile Ser
355 360 365

Thr Gln Gly Ser Thr Asn Thr Ser Ile Asn Pro Val Thr Leu Pro Phe
370 375 380

Thr Ser Arg Asp Val Tyr Arg Thr Glu Ser Leu Ala Gly Leu Asn Leu
385 390 395 400

Phe Leu Thr Gln Pro Val Asn Gly Val Pro Arg Val Asp Phe His Trp
405 410 415

Lys Phe Val Thr His Pro Ile Ala Ser Asp Asn Phe Tyr Tyr Pro Gly
420 425 430

Tyr Ala Gly Ile Gly Thr Gln Leu Gln Asp Ser Glu Asn Glu Leu Pro
435 440 445

Pro Glu Ala Thr Gly Gln Pro Asn Tyr Glu Ser Tyr Ser His Arg Leu
450 455 460

Ser His Ile Gly Leu Ile Ser Ala Ser His Val Lys Ala Ser Val Tyr
465 470 475 480

Ser Trp Thr His Arg Ser Ala Asp Arg Thr Asn Thr Ile Gly Pro Asn
485 490 495

Arg Ile Thr Gln Ile Pro Met Val Lys Ala Ser Glu Leu Pro Gln Gly
500 505 510

Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly Gly Asp Ile Leu Arg
515 520 525

Arg Thr Asn Thr Gly Gly Phe Gly Pro Ile Arg Val Thr Val Asn Gly
530 535 540

Pro Leu Thr Gln Arg Tyr Arg Ile Gly Phe Arg Tyr Ala Ser Thr Val
545 550 555 560

Asp Phe Asp Phe Phe Val Ser Arg Gly Gly Thr Thr Val Asn Asn Phe
565 570 575

Arg Phe Leu Arg Thr Met Asn Ser Gly Asp Glu Leu Lys Tyr Gly Asn
580 585 590

Phe Val Arg Arg Ala Phe Thr Thr Pro Phe Thr Phe Thr Gln Ile Gln
595 600 605
Asp Ile Ile Arg Thr Ser Ile Gln Gly Leu Ser Gly Asn Gly Glu Val
610 615 620

Tyr Ile Asp Lys Ile Glu Ile Ile Pro Val Thr Ala Thr Phe Glu Ala
625 630 635 640

Glu Tyr Asp Leu Glu Arg Ala Gln Glu Ala Val Asn Ala Leu Phe Thr
645 650 655

Asn Thr Asn Pro Arg Arg Leu Lys Thr Asp Val Thr Asp Tyr His Ile
660 665 670

Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser Asp Glu Phe Cys Leu
675 680 685

Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys His Ala Lys Arg Leu
690 695 700

Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn Phe Arg Gly Ile Asn
705 710 715 720

Arg Gln Pro Asp Arg Gly Trp Arg Gly Ser Thr Asp Ile Thr Ile Gln
725 730 735

Gly Gly Asp Asp Val Phe Lys Glu Asn Tyr Val Thr Leu Pro Gly Thr
740 745 750

Val Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln Lys Ile Asp Glu Ser
755 760 765

Lys Leu Lys Ala Tyr Thr Arg Tyr Glu Leu Arg Gly Tyr Ile Glu Asp
770 775 780

Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr Asn Ala Lys His Glu
785 790 795 800

Ile Val Asn Val Pro Gly Thr Gly Ser Leu Trp Pro Leu Ser Ala Gln
805 810 815

Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg Cys Ala Pro His Leu
820 825 830

Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg Asp Gly Glu Lys Cys
835 840 845

Ala His His Ser His His Phe Thr Leu Asp Ile Asp Val Gly Cys Thr
850 855 860
Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile Phe Lys Ile Lys Thr
865 870 875 880

Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu Phe Leu Glu Glu Lys
885 890 895

Pro Leu Leu Gly Glu Ala Leu Ala Arg Val Lys Arg Ala Glu Lys Lys
900 905 910

Trp Arg Asp Lys Arg Glu Lys Leu Gln Leu Glu Thr Asn Ile Val Tyr
915 920 925

Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe Val Asn Ser Gln Tyr
930 935 940

Asp Arg Leu Gln Val Asp Thr Asn Ile Ala Met Ile His Ala Ala Asp
945 950 955 960

Lys Arg Val His Arg Ile Arg Glu Ala Tyr Leu Pro Glu Leu Ser Val
965 970 975

Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu Leu Glu Gly Arg Ile
980 985 990

Phe Thr Ala Tyr Ser Leu Tyr Asp Ala Arg Asn Val Ile Lys Asn Gly
995 1000 1005

Asp Phe Asn Asn Gly Leu Leu Cys Trp Asn Val Lys Gly His Val Asp
1010 1015 1020

Val Glu Glu Gln Asn Asn His Arg Ser Val Leu Val Ile Pro Glu Trp
1025 1030 1035 1040

Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys Pro Gly Arg Gly Tyr
1045 1050 1055

Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr Gly Glu Gly Cys Val
1060 1065 1070

Thr Ile His Glu Ile Glu Asp Asn Thr Asp Glu Leu Lys Phe Ser Asn
1075 1080 1085

Cys Val Glu Glu Glu Val Tyr Pro Asn Asn Thr Val Thr Cys Asn Asn
1090 1095 1100

Tyr Thr Gly Thr Gln Glu Glu Tyr Glu Gly Thr Tyr Thr Ser Arg Asn
1105 1110 1115 1120
Gln Gly Tyr Asp Glu Ala Tyr Gly Asn Asn Pro Ser Val Pro Ala Asp
1125 1130 1135

Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg Glu
1140 1145 1150

Asn Pro Cys Glu Ser Asn Arg Gly Tyr Gly Asp Tyr Thr Pro Leu Pro
1155 1160 1165

Ala Gly Tyr Val Thr Lys Asp Leu Glu Tyr Phe Pro Glu Thr Asp Lys
1170 1175 1180

Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp Ser
1185 1190 1195 1200

Val Glu Leu Leu Leu Met Glu Glu
1205



<210> 5 
<211> 3627 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: modified
cry1Ba gene

<400> 5

atgacttcaa ataggaaaaa tgagaatgaa attataaatg ctgtatcgaa tcattccgca 60
caaatggatc tattaccaga tgctcgtatt gaggatagct tgtgtatagc cgaggggaac 120
aatatcgatc catttgttag cgcatcaaca gtccaaacgg gtattaacat agctggtaga 180
atactaggcg tattgggcgt accgtttgct ggacaactag ctagttttta tagttttctt 240
gttggtgaat tatggccccg cggcagagat cagtgggaaa ttttcctaga acatgtcgaa 300
caacttataa atcaacaaat aacagaaaat gctaggaata cggctcttgc tcgattacaa 360
ggtttaggag attccttcag agcctatcaa cagtcacttg aagattggct agaaaaccgt 420
gatgatgcaa gaacgagaag tgttctttat acccaatata tagctttaga acttgatttt 480
cttaatgcga tgccgctttt cgcaattaga aaccaagaag ttccattatt gatggtatat 540
gctcaagctg caaatttaca cctattatta ttgagagatg cctctctttt tggtagtgaa 600
tttgggctta catcgcagga aattcaacgc tattatgagc gccaagtgga acgaacgaga 660
gattattccg actattgcgt agaatggtat aatacaggtc taaatagctt gagagggaca 720
aatgccgcaa gttgggtacg gtataatcaa ttccgtagag atctaacgtt aggagtatta 780
gatctagtgg cactattccc aagctatgac actcgcactt atccaataaa tacgagtgct 840
cagttaacaa gagaagttta tacagacgca attggagcaa caggggtaaa tatggcaagt 900
atgaattggt ataataataa tgcacctccg ttctctgcca tagaggctgc ggctatccga 960
agcccgcatc tacttgattt tctagaacaa cttacaattt ttagcgcttc atcacgatgg 1020
agtaatacta ggcatatgac ttattggcgg gggcacacga ttcaatctcg gccaatagga 1080
ggcggattaa atacctcaac gcatggggct accaatactt ctattaatcc tgtaacatta 1140
cggttcgcat ctcgagacgt ttataggact gaatcatatg caggagtgct tctatgggga 1200
atttaccttg aacctattca tggtgtccct actgttaggt ttaattttac gaaccctcag 1260
aatatttctg atagaggtac cgctaactat agtcaacctt atgagtcacc tgggcttcaa 1320
ttaaaagatt cagaaactga attaccacca gaaacaacag aacgaccaaa ttatgaatct 1380
tacagtcaca ggttatctca tataggtata attttacaat ccagggtgaa tgtaccggta 1440
tatncttgga cgcatcgtag tgcggaccgt acgaatacga ttggaccaaa tagaatcacc 1500
caaatcccaa tggtaaaagc atccgaactt cctcaaggta ccactgttgt tagaggacca 1560
ggatttactg gtggggatat tcttcgaaga accaatactg gtggatttgg accgataaga 1620
gtaactgtta acggaccatt aacacaaaga tatcgtatag gattccgcta tgcttcaact 1680
gtagattttg atttctttgt atcacgtgga ggtactactg taaataattt tagattccta 1740
cgtacaatga acagtggaga cgaactaaaa tacggaaatt ttgtgagacg tgcttttact 1800
acacctttta cttttacaca aattcaagat ataattcgaa cgtctattca aggccttagt 1860
ggaaatgggg aagtgtatat agataaaatt gaaattattc cagttactgc aaccttcgaa 1920
gcagaatatg atttagaaag agcgcaagag gcggtgaatg ctctgtttac taatacgaat 1980
ccaagaagat tgaaaacaga tgtgacagat tatcatattg atcaagtatc caatttagtg 2040
gattgtttat cagatgaatt ttgtctggat gaaaagcgag aattgtccga gaaagtcaaa 2100
catgcgaagc gactcagtga tgagcggaat ttacttcaag atccaaactt cagagggatc 2160
aatagacaac cagaccgtgg ctggagagga agtacagata ttaccatcca aggaggagat 2220
gacgtattca aagagaatta cgtcacacta ccgggtaccg ttgatgagtg ctatccaacg 2280
tatttatatc agaaaataga tgagtcgaaa ttaaaagctt atacccgtta tgaattaaga 2340
gggtatatcg aagatagtca agacttagaa atctatttga tccgttacaa tgcaaaacac 2400
gaaatagtaa atgtgccagg cacgggttcc ttatggccgc tttcagccca aagtccaatc 2460
ggaaagtgtg gagaaccgaa tcgatgcgcg ccacaccttg aatggaatcc tgatctagat 2520
tgttcctgca gagacgggga aaaatgtgca catcattccc atcatttcac cttggatatt 2580
gatgttggat gtacagactt aaatgaggac ttaggtgtat gggtgatatt caagattaag 2640
acgcaagatg gccatgcaag actagggaat ctagagtttc tcgaagagaa accattatta 2700
ggggaagcac tagctcgtgt gaaaagagcg gagaagaagt ggagagacaa acgagagaaa 2760
ctgcagttgg aaacaaatat tgtttataaa gaggcaaaag aatctgtaga tgctttattt 2820
gtaaactctc aatatgatag attacaagtg gatacgaaca tcgcgatgat tcatgcggca 2880
gataaacgcg ttcatagaat ccgggaagcg tatctgccag agttgtctgt gattccaggt 2940
gtcaatgcgg ccattttcga agaattagag ggacgtattt ttacagcgta ttccttatat 3000
gatgcgagaa atgtcattaa aaatggcgat ttcaataatg gcttattatg ctggaacgtg 3060
aaaggtcatg tagatgtaga agagcaaaac aaccaccgtt cggtccttgt tatcccagaa 3120
tgggaggcag aagtgtcaca agaggttcgt gtctgtccag gtcgtggcta tatccttcgt 3180
gtcacagcat ataaagaggg atatggagag ggcrgcgtaa cgatccatga gatcgaagac 3240
aatacagacg aactgaaatt cagcaactgt gtagaagagg aagtatatcc aaacaacaca 3300
gtaacgtgta ataattatac tgggactcaa gaagaatatg agggtacgta cacttctcgt 3360
aaccaaggat atgacgaagc ctatggtaat aacccttccg taccagctga ttacgcttca 3420
gtctatgaag aaaaatcgta tacagatgga cgaagagaga atccttgtga atctaacaga 3480
ggctatgggg attacacacc actaccggct ggttatgtaa caaaggattn agagtacttc 3540
ccagagaccg ataaggtatg gattgagatc ggagaaacag aaggaacatt catcgtggat 3600
agcgtggaat taccccttat ggaggaa 3627



<210> 6
<211> 1209
<212> PRT
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: protein 
encoded by the modified cry1Ba gene

<400> 6 

Met Thr Ser Asn Arg Lys Asn Glu Asn Glu Ile Ile Asn Ala Val Ser
1 5 10 15

Asn His Ser Ala Gln Met Asp Leu Leu Pro Asp Ala Arg Ile Glu Asp
20 25 30

Ser Leu Cys Ile Ala Glu Gly Asn Asn Ile Asp Pro Phe Val Ser Ala
35 40 45

Ser Thr Val Gln Thr Gly Ile Asn Ile Ala Gly Arg Ile Leu Gly Val
50 55 60

Leu Gly Val Pro Phe Ala Gly Gln Leu Ala Ser Phe Tyr Ser Phe Leu
65 70 75 80

Val Gly Glu Leu Trp Pro Arg Gly Arg Asp Gln Trp Glu Ile Phe Leu
85 90 95

Glu His Val Glu Gln Leu Ile Asn Gln Gln Ile Thr Glu Asn Ala Arg
100 105 110

Asn Thr Ala Leu Ala Arg Leu Gln Gly Leu Gly Asp Ser Phe Arg Ala
115 120 125

Tyr Gln Gln Ser Leu Glu Asp Trp Leu Glu Asn Arg Asp Asp Ala Arg
130 135 140

Thr Arg Ser Val Leu Tyr Thr Gln Tyr Ile Ala Leu Glu Leu Asp Phe
145 150 155 160

Leu Asn Ala Met Pro Leu Phe Ala Ile Arg Asn Gln Glu Val Pro Leu
165 170 175

Leu Met Val Tyr Ala Gln Ala Ala Asn Leu His Leu Leu Leu Leu Arg
180 185 190

Asp Ala Ser Leu Phe Gly Ser Glu Phe Gly Leu Thr Ser Gln Glu Ile
195 200 205

Gln Arg Tyr Tyr Glu Arg Gln Val Glu Arg Thr Arg Asp Tyr Ser Asp
210 215 220
Tyr Cys Val Glu Trp Tyr Asn Thr Gly Leu Asn Ser Leu Arg Gly Thr
225 230 235 240

Asn Ala Ala Ser Trp Val Arg Tyr Asn Gln Phe Arg Arg Asp Leu Thr
245 250 255

Leu Gly Val Leu Asp Leu Val Ala Leu Phe Pro Ser Tyr Asp Thr Arg
260 265 270

Thr Tyr Pro Ile Asn Thr Ser Ala Gln Leu Thr Arg Glu Val Tyr Thr
275 280 285

Asp Ala Ile Gly Ala Thr Gly Val Asn Met Ala Ser Met Asn Trp Tyr
290 295 300

Asn Asn Asn Ala Pro Ser Phe Ser Ala Ile Glu Ala Ala Ala Ile Arg
305 310 315 320

Ser Pro His Leu Leu Asp Phe Leu Glu Gln Leu Thr Ile Phe Ser Ala
325 330 335

Ser Ser Arg Trp Ser Asn Thr Arg His Met Thr Tyr Trp Arg Gly His
340 345 350

Thr Ile Gln Ser Arg Pro Ile Gly Gly Gly Leu Asn Thr Ser Thr His
355 360 365

Gly Ala Thr Asn Thr Ser Ile Asn Pro Val Thr Leu Arg Phe Ala Ser
370 375 380

Arg Asp Val Tyr Arg Thr Glu Ser Tyr Ala Gly Val Leu Leu Trp Gly
385 390 395 400

Ile Tyr Leu Glu Pro Ile His Gly Val Pro Thr Val Arg Phe Asn Phe
405 410 415

Thr Asn Pro Gln Asn Ile Ser Asp Arg Gly Thr Ala Asn Tyr Ser Gln
420 425 430

Pro Tyr Glu Ser Pro Gly Leu Gln Leu Lys Asp Ser Glu Thr Glu Leu
435 440 445

Pro Pro Glu Thr Thr Glu Arg Pro Asn Tyr Glu Ser Tyr Ser His Arg
450 455 460

Leu Ser His Ile Gly Ile Ile Leu Gln Ser Arg Val Asn Val Pro Val
465 470 475 480
Tyr Ser Trp Thr His Arg Ser Ala Asp Arg Thr Asn Thr Ile Gly Pro
485 490 495

Asn Arg Ile Thr Gln Ile Pro Met Val Lys Ala Ser Glu Leu Pro Gln
500 505 510

Gly Thr Thr Val Val Arg Gly Pro Gly Phe Thr Gly Gly Asp Ile Leu
515 520 525

Arg Arg Thr Asn Thr Gly Gly Phe Gly Pro Ile Arg Val Thr Val Asn
530 535 540

Gly Pro Leu Thr Gln Arg Tyr Arg Ile Gly Phe Arg Tyr Ala Ser Thr
545 550 555 560

Val Asp Phe Asp Phe Phe Val Ser Arg Gly Gly Thr Thr Val Asn Asn
565 570 575

Phe Arg Phe Leu Arg Thr Met Asn Ser Gly Asp Glu Leu Lys Tyr Gly
580 585 590

Asn Phe Val Arg Arg Ala Phe Thr Thr Pro Phe Thr Phe Thr Gln Ile
595 600 605

Gln Asp Ile Ile Arg Thr Ser Ile Gln Gly Leu Ser Gly Asn Gly Glu
610 615 620

Val Tyr Ile Asp Lys Ile Glu Ile Ile Pro Val Thr Ala Thr Phe Glu
625 630 635 640

Ala Glu Tyr Asp Leu Glu Arg Ala Gln Glu Ala Val Asn Ala Leu Phe
645 650 655

Thr Asn Thr Asn Pro Arg Arg Leu Lys Thr Asp Val Thr Asp Tyr His
660 665 670

Ile Asp Gln Val Ser Asn Leu Val Asp Cys Leu Ser Asp Glu Phe Cys
675 680 685

Leu Asp Glu Lys Arg Glu Leu Ser Glu Lys Val Lys His Ala Lys Arg
690 695 700

Leu Ser Asp Glu Arg Asn Leu Leu Gln Asp Pro Asn Phe Arg Gly Ile
705 710 715 720

Asn Arg Gln Pro Asp Arg Gly Trp Arg Gly Ser Thr Asp Ile Thr Ile
725 730 735
Gln Gly Gly Asp Asp Val Phe Lys Glu Asn Tyr Val Thr Leu Pro Gly
740 745 750

Thr Val Asp Glu Cys Tyr Pro Thr Tyr Leu Tyr Gln Lys Ile Asp Glu
755 760 765

Ser Lys Leu Lys Ala Tyr Thr Arg Tyr Glu Leu Arg Gly Tyr Ile Glu
770 775 780

Asp Ser Gln Asp Leu Glu Ile Tyr Leu Ile Arg Tyr Asn Ala Lys His
785 790 795 800

Glu Ile Val Asn Val Pro Gly Thr Gly Ser Leu Trp Pro Leu Ser Ala
805 810 815

Gln Ser Pro Ile Gly Lys Cys Gly Glu Pro Asn Arg Cys Ala Pro His
820 825 830

Leu Glu Trp Asn Pro Asp Leu Asp Cys Ser Cys Arg Asp Gly Glu Lys
835 840 845

Cys Ala His His Ser His His Phe Thr Leu Asp Ile Asp Val Gly Cys
850 855 860

Thr Asp Leu Asn Glu Asp Leu Gly Val Trp Val Ile Phe Lys Ile Lys
865 870 875 880

Thr Gln Asp Gly His Ala Arg Leu Gly Asn Leu Glu Phe Leu Glu Glu
885 890 895

Lys Pro Leu Leu Gly Glu Ala Leu Ala Arg Val Lys Arg Ala Glu Lys
900 905 910

Lys Trp Arg Asp Lys Arg Glu Lys Leu Gln Leu Glu Thr Asn Ile Val
915 920 925

Tyr Lys Glu Ala Lys Glu Ser Val Asp Ala Leu Phe Val Asn Ser Gln
930 935 940

Tyr Asp Arg Leu Gln Val Asp Thr Asn Ile Ala Met Ile His Ala Ala
945 950 955 960

Asp Lys Arg Val His Arg Ile Arg Glu Ala Tyr Leu Pro Glu Leu Ser
965 970 975

Val Ile Pro Gly Val Asn Ala Ala Ile Phe Glu Glu Leu Glu Gly Arg
980 985 990
Ile Phe Thr Ala Tyr Ser Leu Tyr Asp Ala Arg Asn Val Ile Lys Asn
995 1000 1005

Gly Asp Phe Asn Asn Gly Leu Leu Cys Trp Asn Val Lys Gly His Val
1010 1015 1020

Asp Val Glu Glu Gln Asn Asn His Arg Ser Val Leu Val Ile Pro Glu
1025 1030 1035 1040

Trp Glu Ala Glu Val Ser Gln Glu Val Arg Val Cys Pro Gly Arg Gly
1045 1050 1055

Tyr Ile Leu Arg Val Thr Ala Tyr Lys Glu Gly Tyr Gly Glu Gly Cys
1060 1065 1070

Val Thr Ile His Glu Ile Glu Asp Asn Thr Asp Glu Leu Lys Phe Ser
1075 1080 1085

Asn Cys Val Glu Glu Glu Val Tyr Pro Asn Asn Thr Val Thr Cys Asn
1090 1095 1100

Asn Tyr Thr Gly Thr Gln Glu Glu Tyr Glu Gly Thr Tyr Thr Ser Arg
1105 1110 1115 1120

Asn Gln Gly Tyr Asp Glu Ala Tyr Gly Asn Asn Pro Ser Val Pro Ala
1125 1130 1135

Asp Tyr Ala Ser Val Tyr Glu Glu Lys Ser Tyr Thr Asp Gly Arg Arg
1140 1145 1150

Glu Asn Pro Cys Glu Ser Asn Arg Gly Tyr Gly Asp Tyr Thr Pro Leu
1155 1160 1165

Pro Ala Gly Tyr Val Thr Lys Asp Leu Glu Tyr Phe Pro Glu Thr Asp
1170 1175 1180

Lys Val Trp Ile Glu Ile Gly Glu Thr Glu Gly Thr Phe Ile Val Asp
1185 1190 1195 1200

Ser Val Glu Leu Leu Leu Met Glu Glu
1205



<210> 7 
<211> 2160 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: modified
cry1Ia gene

<400> 7 

atgaaactaa agaatcaaga taagcatcaa agtttttcta gcaatgcgaa agtagataaa 60 
atctctacgg attcactaaa aaatgaaaca gatatagaat tacaaaacat taatcatgaa 120 
gattgtttga aaatgtctga gtatgaaaat gtagagccgt ttgttagtgc atcaacaatt 180 
caaacaggta ttggtattgc gggtaaaata cttggtaccc taggcgttcc ttttgcagga 240
caagtagcta gtctttatag ttttatctta ggtgagctat ggcctaaggg gaaaaatcaa 300
tgggaaatct ttatggaaca tgtagaagag attattaatc aaaaaatatc aacttatgca 360 
agaaataaag cacttacaga cttgaaagga ttaggagatg ccttagctgt ctaccatgat 420 
tcgcttgaaa gttgggttgg aaatcgtaat aacacaaggg ctaggagtgt tgtcaagagc 480 
caatatatcg cattagaatt gatgttcgtt cagaaactac cttcttttgc agtgtctgga 540 
gaggaggtac cattattacc gatatatgcc caagctgcaa atttacattt gttgctatta 600 
agagatgcat ctatttttgg aaaagagtgg ggattatcat cttcagaaat ttcaacattt 660 
tataaccgtc aagtcgaacg agcaggagat tattcctacc attgtgtgaa atggtatagc 720 
acaggtctaa ataacttgag gggtacaaat gccgaaagtt gggtacgata taatcaattc 780
cgtagagaca tgactttaat ggtactagat ttagtggcac tatttccaag ctatgataca 840
caaatgtatc caattaaaac tacagcccaa cttacaagag aagtatatac agacgcaatt 900
gggacagtac atccgcatcc aagttttaca agtacgactt ggtataataa taatgcacct 960 
tcgttctctg ccatagaggc tgctgttgtt cgaaacccgc atctactcga ttttctagaa 1020 
caagttacaa tttacagctt attaagtcga tggagtaaca ctcagtatat gaatatgtgg 1080 
ggaggacata aactagaatt ccgaacaata ggaggaacgt taaatatctc aacacaagga 1140
tctactaata cttctattaa tcctgtaaca ttaccgttca cttctcgaga cgtctatagg 1200 
actgaatcat tggcagggct gaatctattt ttaactcaac ctgttaatgg agtacctagg 1260 
gttgattttc attggaaatt cgtcacacat ccgatcgcat ctgataattt ctattatcca 1320 
gggtatgctg gaattgggac gcaattacag gattcagaaa atgaattacc acctgaagca 1380 
acaggacagc caaattatga atcttatagt catagattat ctcatatagg actcatttca 1440 
gcatcacatg tgaaagcatt ggtatattct tggacgcatc gtagtgcgga ccgtacaaat 1500 
acaattgagc caaatagcat tacacaaata ccattagtaa aagctttcaa tctgtcttca 1560 
ggtgccgctg tagtgagagg accaggattt acaggtgggg atatccttcg aagaacgaat 1620 
actggtacat ttggggatat acgagtaaat attaatccac catttgcaca aagatatcgc 1680 
gtgaggattc gctatgcttc taccacagat ttacaattcc atacgtcaat taacggtaaa 1740 
gctattaatc aaggtaattt ttcagcaact atgaatagag gagaggactt agactataaa 1800 
acctttagaa ctgtaggctt taccactcca tttagctttt tagatgtaca aagtacattc 1860 
acaataggtg cttggaactt ctcttcaggt aacgaagttt atatagatag aattgaattt 1920 
gttccggtag aagtaacata tgaggcagaa tatgattttg aaaaagcgca agagaaggtt 1980 
actgcactgt ttacatctac gaatccaaga ggattaaaaa cagatgtaaa ggattatcat 2040 
attgaccagg tatcaaattt agtagagtct ctatcagatg aattctatct tgatgaaaag 2100 
agagaattat tcgagatagt taaatacgcg aagcaactcc atattgagcg taacatgtag 2160 

<210> 8 
<211> 719 
<212> PRT 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: protein 
encoded by the modified cry1Ia gene

<400> 8 

Met Lys Leu Lys Asn Gln Asp Lys His Gln Ser Phe Ser Ser Asn Ala
1 5 10 15

Lys Val Asp Lys Ile Ser Thr Asp Ser Leu Lys Asn Glu Thr Asp Ile
20 25 30

Glu Leu Gln Asn Ile Asn His Glu Asp Cys Leu Lys Met Ser Glu Tyr
35 40 45

Glu Asn Val Glu Pro Phe Val Ser Ala Ser Thr Ile Gln Thr Gly Ile
50 55 60

Gly Ile Ala Gly Lys Ile Leu Gly Thr Leu Gly Val Pro Phe Ala Gly
65 70 75 80

Gln Val Ala Ser Leu Tyr Ser Phe Ile Leu Gly Glu Leu Trp Pro Lys
85 90 95

Gly Lys Asn Gln Trp Glu Ile Phe Met Glu His Val Glu Glu Ile Ile
100 105 110

Asn Gln Lys Ile Ser Thr Tyr Ala Arg Asn Lys Ala Leu Thr Asp Leu
115 120 125

Lys Gly Leu Gly Asp Ala Leu Ala Val Tyr His Asp Ser Leu Glu Ser
130 135 140

Trp Val Gly Asn Arg Asn Asn Thr Arg Ala Arg Ser Val Val Lys Ser
145 150 155 160

Gln Tyr Ile Ala Leu Glu Leu Met Phe Val Gln Lys Leu Pro Ser Phe
165 170 175

Ala Val Ser Gly Glu Glu Val Pro Leu Leu Pro Ile Tyr Ala Gln Ala
180 185 190

Ala Asn Leu His Leu Leu Leu Leu Arg Asp Ala Ser Ile Phe Gly Lys
195 200 205

Glu Trp Gly Leu Ser Ser Ser Glu Ile Ser Thr Phe Tyr Asn Arg Gln
210 215 220
Val Glu Arg Ala Gly Asp Tyr Ser Tyr His Cys Val Lys Trp Tyr Ser
225 230 235 240

Thr Gly Leu Asn Asn Leu Arg Gly Thr Asn Ala Glu Ser Trp Val Arg
245 250 255

Tyr Asn Gln Phe Arg Arg Asp Met Thr Leu Met Val Leu Asp Leu Val
260 265 270

Ala Leu Phe Pro Ser Tyr Asp Thr Gln Met Tyr Pro Ile Lys Thr Thr
275 280 285

Ala Gln Leu Thr Arg Glu Val Tyr Thr Asp Ala Ile Gly Thr Val His
290 295 300

Pro His Pro Ser Phe Thr Ser Thr Thr Trp Tyr Asn Asn Asn Ala Pro
305 310 315 320

Ser Phe Ser Ala Ile Glu Ala Ala Val Val Arg Asn Pro His Leu Leu
325 330 335

Asp Phe Leu Glu Gln Val Thr Ile Tyr Ser Leu Leu Ser Arg Trp Ser
340 345 350

Asn Thr Gln Tyr Met Asn Met Trp Gly Gly His Lys Leu Glu Phe Arg
355 360 365

Thr Ile Gly Gly Thr Leu Asn Ile Ser Thr Gln Gly Ser Thr Asn Thr
370 375 380

Ser Ile Asn Pro Val Thr Leu Pro Phe Thr Ser Arg Asp Val Tyr Arg
385 390 395 400

Thr Glu Ser Leu Ala Gly Leu Asn Leu Phe Leu Thr Gln Pro Val Asn
405 410 415

Gly Val Pro Arg Val Asp Phe His Trp Lys Phe Val Thr His Pro Ile
420 425 430

Ala Ser Asp Asn Phe Tyr Tyr Pro Gly Tyr Ala Gly Ile Gly Thr Gln
435 440 445

Leu Gln Asp Ser Glu Asn Glu Leu Pro Pro Glu Ala Thr Gly Gln Pro
450 455 460

Asn Tyr Glu Ser Tyr Ser His Arg Leu Ser His Ile Gly Leu Ile Ser
465 470 475 480






Ala Ser His Val Lys Ala Leu Val Tyr Ser Trp Thr His Arg Ser Ala 
485 490 495 

Asp Arg Thr Asn Thr Ile Glu Pro Asn Ser Ile Thr Gln Ile Pro Leu 
500 505 510 

Val Lys Ala Phe Asn Leu Ser Ser Gly Ala Ala Val Val Arg Gly Pro 
515 520 525 

Gly Phe Thr Gly Gly Asp Ile Leu Arg Arg Thr Asn Thr Gly Thr Phe 
530 535 540 

Gly Asp Ile Arg Val Asn Ile Asn Pro Pro Phe Ala Gln Arg Tyr Arg 
545 550 555 560 

Val Arg Ile Arg Tyr Ala Ser Thr Thr Asp Leu Gln Phe His Thr Ser 
565 570 575 

Ile Asn Gly Lys Ala Ile Asn Gln Gly Asn Phe Ser Ala Thr Met Asn 
580 585 590 

Arg Gly Glu Asp Leu Asp Tyr Lys Thr Phe Arg Thr Val Gly Phe Thr 
595 600 605 

Thr Pro Phe Ser Phe Leu Asp Val Gln Ser Thr Phe Thr Ile Gly Ala 
610 615 620 

Trp Asn Phe Ser Ser Gly Asn Glu Val Tyr Ile Asp Arg Ile Glu Phe 
625 630 635 640 

Val Pro Val Glu Val Thr Tyr Glu Ala Glu Tyr Asp Phe Glu Lys Ala 
645 650 655 

Gln Glu Lys Val Thr Ala Leu Phe Thr Ser Thr Asn Pro Arg Gly Leu 
660 665 670 

Lys Thr Asp Val Lys Asp Tyr His Ile Asp Gln Val Ser Asn Leu Val 
675 680 685 

Glu Ser Leu Ser Asp Glu Phe Tyr Leu Asp Glu Lys Arg Glu Leu Phe 
690 695 700 

Glu Ile Val Lys Tyr Ala Lys Gln Leu His Ile Glu Arg Asn Met 
705 710 715 
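
For downstream analysis, a three-letter listing such as the one above is commonly converted to one-letter code. The following is a minimal Python sketch of that conversion, assuming the standard IUPAC three-letter abbreviations. The function name `to_one_letter` is illustrative and not part of the application. The sample input is the first two rows of the listing, with residues in their standard spelling.

```python
# Standard IUPAC three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert sequence-listing text to a one-letter string.

    Tokens made only of digits are treated as position numbers and skipped.
    Any other unknown token raises ValueError, so damaged residue names
    are reported rather than silently dropped.
    """
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue token: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# First two rows of the listing (illustrative input).
first_rows = """
Met Lys Leu Lys Asn Gln Asp Lys His Gln Ser Phe Ser Ser Asn Ala
1 5 10 15
Lys Val Asp Lys Ile Ser Thr Asp Ser Leu Lys Asn Glu Thr Asp Ile
20 25 30
"""
print(to_one_letter(first_rows))  # MKLKNQDKHQSFSSNAKVDKISTDSLKNETDI
```

Raising on unknown tokens is a deliberate choice here: a misread residue name (for example `Gin` for `Gln`) stops the conversion instead of shortening the sequence unnoticed.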

<210> 9 
<211> 37 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: oligomer used 
for mutagenesis of the cry1Ba gene 

<400> 9 

ggacgcatcg tagtgcggac cgtacgaata cgattgg 37 



<210> 10 
<211> 36 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: oligomer used 
for mutagenesis of the cry1Ia gene 

<400> 10 

ggacgcatcg tagtgcggac cgtacaaata caattg 36 
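
The stated lengths of the two mutagenic oligomers, and the positions at which they differ, can be checked with a short script. This is a minimal Python sketch. The helper names `clean` and `mismatches` are illustrative, and the ungapped, position-by-position comparison is an assumption made for the example, not a procedure described in the application.

```python
def clean(seq: str) -> str:
    """Remove the spacing used in the listing and lowercase the sequence."""
    return "".join(seq.split()).lower()


def mismatches(a: str, b: str) -> list[int]:
    """Return 1-based positions where a and b differ over their shared length."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


seq9 = clean("ggacgcatcg tagtgcggac cgtacgaata cgattgg")   # SEQ ID NO:9
seq10 = clean("ggacgcatcg tagtgcggac cgtacaaata caattg")   # SEQ ID NO:10

print(len(seq9), len(seq10))      # 37 36
print(mismatches(seq9, seq10))    # [26, 32]
```

Under this comparison the two oligomers share their first 25 nucleotides and differ at positions 26 and 32, and SEQ ID NO:9 carries one extra 3' nucleotide.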



Claims 

1. Bacillus thuringiensis hybrid toxin fragment comprising structural domains I, II and III in this order starting from the 
N-terminal, wherein the domains are derived from at least two different Cry proteins, domain I is domain I of any 
Bacillus thuringiensis Cry protein or a part of said domain or a peptide substantially similar to said domain, domain 
II is domain II of Cry1Ia or a part of said domain or a peptide substantially similar to said domain, and domain III 
is domain III of Cry1Ba or a part of said domain or a peptide substantially similar to said domain. 

2. Toxin fragment according to claim 1, wherein domain I is domain I of Cry1Ia or Cry1Ba or a part of said domain or 
a peptide substantially similar to said domain. 

3. Toxin fragment according to claim 1 or 2 comprising an amino acid sequence from about amino acid 20 to about 
amino acid 641 as shown in SEQ ID NO:2 or an amino acid sequence from about amino acid 20 to about amino 
acid 632 as shown in SEQ ID NO:4. 

4. A hybrid toxin comprising the fragment of any one of the preceding claims, or a toxin at least 85% similar to such 
a hybrid toxin which has substantially similar insecticidal activity. 

5. A toxin according to claim 4 comprising an amino acid sequence as shown in SEQ ID NO:2 or SEQ ID NO:4. 

6. Pure proteins which are at least 90% identical to the toxin fragments or hybrid toxins of any of claims 1-5. 

7. Recombinant DNA comprising a sequence encoding a protein having an amino acid sequence of the proteins as 
claimed in any one of claims 1-6. 

8. Recombinant DNA according to claim 7 comprising the sequence as shown in SEQ ID NO:1 or 3 or DNA similar 
thereto encoding a substantially similar protein. 

9. Recombinant DNA according to claim 7 comprising the nucleotide sequence from about nucleotide 170 to about 
1929 in SEQ ID NO:1 or from about 147 to about 1896 in SEQ ID NO:3. 

10. Recombinant DNA according to any of claims 7-9, which further encodes a protein having herbicide resistance, 
plant growth-promoting, anti-fungal, anti-bacterial, antiviral and/or anti-nematode properties. 

11. Recombinant DNA according to any one of claims 7 to 10 which is modified in that known mRNA instability motifs 
or polyadenylation signals are removed and/or codons which are preferred by the organism into which the recombinant 
DNA is to be inserted are used so that expression of the thus modified DNA in the said organism yields 
substantially similar protein to that obtained by expression of the unmodified recombinant DNA in the organism in 
which the protein components of the hybrid toxin or toxin fragments are endogenous. 

12. A DNA sequence which is complementary to one which hybridizes under stringent conditions with the DNA of any 
one of claims 7 to 11. 

13. A vector containing a DNA sequence as claimed in any one of claims 7 to 12. 

14. A plant or micro-organism which includes, and enables expression of, the DNA of any one of claims 7-12 or the 
vector of claim 13. 

15. Plants transformed with recombinant DNA as claimed in any one of claims 7 to 12, the progeny of such plants 
which contain the DNA stably incorporated and hereditable in a Mendelian manner, and/or the seeds of such plants 
and such progeny. 

16. Protein derived from expression of the DNA as claimed in any one of claims 7 to 12 and insecticidal protein produced 
by expression of the recombinant DNA within plants as claimed in claim 15. 

17. An insecticidal composition containing one or more of the proteins as claimed in any one of claims 1-6 and 16. 

18. A process for combatting insects which comprises exposing them to proteins or compositions as claimed in any 
one of claims 1-6, 16 and 17 or the micro-organism of claim 14. 

19. An extraction process for obtaining insecticidal proteins, as claimed in any one of claims 1-6 or claim 16, from 
organic material containing them comprising submitting the material to maceration and solvent extraction, characterized 
in that the material is a microorganism. 
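
Several of the claims above rely on a percentage threshold ("at least 90% identical", "at least 85% similar"). The claims do not state how the percentage is computed, so the following Python sketch is only one common convention, under stated assumptions: the two sequences are already aligned to equal length, `-` marks a gap, columns with a gap in one sequence count as mismatches, and the denominator is the number of aligned columns. The function name `percent_identity` and the toy variant sequence are illustrative and do not come from the application.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns that hold identical residues.

    Inputs must be pre-aligned to the same length, with '-' for gaps.
    Columns where both sequences have a gap are ignored; a gap facing
    a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy example: a ten-residue reference and a hypothetical single-substitution variant.
reference = "MKLKNQDKHQ"
variant = "MKLKNQDRHQ"
print(percent_identity(reference, variant))  # 90.0
```

Other conventions (for example, dividing by the length of the shorter sequence, or scoring conservative substitutions as "similar") give different values for the same pair, so any comparison against a threshold should state which convention was used.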